Add support for Optimisers.jl #114

Merged · 6 commits · Apr 30, 2022
Conversation

@lorenzoh (Member) commented Apr 30, 2022

Closes #112 (once done). @ToucheSir @darsnack

So this is a first draft for adding Optimisers.jl support (new optims) while keeping compatibility with optimisers in Flux.Optimise (old optims).

Passing in new optims already works, but I've broken support for old optims. Before, FluxTraining.jl was using implicit parameters with Params and Grads objects; I'm not sure how to make the old optims work now that explicit parameters are passed to gradient.

I'll leave some more questions next to the code changes; feedback from you two would be much appreciated!
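For context on the two APIs this has to bridge, here is a minimal, hedged sketch using plain Flux/Zygote/Optimisers.jl (illustration only, not FluxTraining.jl code; the model, data, and loss function are made up):

using Flux, Optimisers, Zygote

model = Dense(2, 1)
xs, ys = rand(Float32, 2, 8), rand(Float32, 1, 8)
loss(m) = Flux.Losses.mse(m(xs), ys)

# Old style: implicit parameters, a Params/Grads pair, Flux.Optimise optimisers.
ps = Flux.params(model)
gs = Zygote.gradient(() -> loss(model), ps)          # a Zygote.Grads
Flux.Optimise.update!(Flux.Optimise.Descent(0.01), ps, gs)

# New style: explicit, structural gradients and Optimisers.jl rules.
state = Optimisers.setup(Optimisers.Descent(0.01), model)
grads, = Zygote.gradient(loss, model)                # a NamedTuple-like gradient
state, model = Optimisers.update!(state, model, grads)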

src/training.jl Outdated
@@ -49,18 +49,29 @@ function step! end
 function step!(learner, phase::TrainingPhase, batch)
     xs, ys = batch
     runstep(learner, phase, (; xs=xs, ys=ys)) do handle, state
-        state.grads = gradient(learner.params) do
-            state.ŷs = learner.model(state.xs)
+        state.grads, _, _ = gradient(learner.model, state.xs, state.ys) do model, xs, ys
lorenzoh (Member Author):
I'm not sure passing in all three is necessary here. We only need the gradient with respect to the model, so I wonder if differentiating with respect to xs and ys as well computes some unneeded gradients for the inputs.
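For reference, a hedged sketch of the shape this could take if only the model is differentiated (lossfn stands in for the learner's loss function; not necessarily the final code):

# Differentiate only with respect to the model; xs and ys are closed over,
# so no gradients are computed for the inputs.
state.grads, = gradient(learner.model) do model
    state.ŷs = model(state.xs)
    lossfn(state.ŷs, state.ys)
end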

src/training.jl Outdated
end
end

# Handle both old Flux.jl and new Optimisers.jl optimisers
function _update!(optimizer::Flux.Optimise.AbstractOptimiser, params, model, grads)
    update!(optimizer, model, grads)
lorenzoh (Member Author):
This currently throws an error. For context, params isa Params but grads is no longer a Grads. Is a Params even needed anymore?

darsnack (Member):
Left a comment above about this.
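To make the thread above concrete, a hedged sketch of how _update! could cover both optimiser families (illustration only; a real implementation would cache the Optimisers.jl state tree between steps rather than rebuilding it, and the merged code may differ):

# Old Flux optimisers mutate the parameters in place and need Params/Grads.
function _update!(optimizer::Flux.Optimise.AbstractOptimiser, params, model, grads)
    Flux.Optimise.update!(optimizer, params, grads)
    return model
end

# Optimisers.jl rules use an explicit state tree and structural gradients.
# For simplicity this sets the state up on every call; stateful rules like
# momentum need the state kept and threaded through between steps.
function _update!(optimizer::Optimisers.AbstractRule, params, model, grads)
    state = Optimisers.setup(optimizer, model)
    state, model = Optimisers.update!(state, model, grads)
    return model
end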



@testset "Optimisers.jl compatibility" begin
    learner = testlearner(coeff = 3, opt=Optimisers.Descent(0.001))
lorenzoh (Member Author):
This already passes 👍, but all the old optim tests are broken for the time being.

src/training.jl Outdated
-        state.grads = gradient(learner.params) do
-            state.ŷs = learner.model(state.xs)
+        state.grads, = gradient(learner.model) do model
darsnack (Member):
I think here you want to take the gradient w.r.t. learner.params when the optimizer is a Flux.Optimise.AbstractOptimiser. Conversely, if it is not, you take the gradient w.r.t. learner.model like you are doing now.

This is why update! below is erroring: the old optimizers need a Grads object, and you can only get that with implicit params.

I think some dispatch for the gradient call would be easiest. Another option is a utility that takes the model, the gradient w.r.t. it, and a Params, and produces a matching Grads.
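A hedged sketch of that dispatch; _gradient is a hypothetical helper name, and the merged code may differ:

# Implicit mode for old Flux optimisers: differentiate w.r.t. learner.params
# and return a Zygote.Grads, which Flux.Optimise.update! expects.
function _gradient(f, optimizer::Flux.Optimise.AbstractOptimiser, model, params)
    return Zygote.gradient(() -> f(model), params)
end

# Explicit mode for Optimisers.jl rules: differentiate w.r.t. the model itself
# and return the structural gradient.
function _gradient(f, optimizer, model, params)
    grads, = Zygote.gradient(f, model)
    return grads
end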

lorenzoh (Member Author):
Ah, I figured. I was hoping there might be a way to make the same Zygote.gradient call work for both, but I guess not. I'll add a dispatch on the optimiser there.

darsnack (Member):
Well, what you're asking for might come in a later version of Flux as part of the AD-agnostic push, so the code might eventually get simpler.

lorenzoh (Member Author):
That's good to know. Definitely let me know then, so I can clean this up again.

@lorenzoh (Member Author):
@darsnack I added the dispatch for gradient. Can you take a quick look to make sure it looks okay before I merge?
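For illustration, a hedged sketch of how the training step could look with the two dispatching helpers sketched above (_gradient and _update! are placeholder names; learner.optimizer and learner.lossfn are assumed field names, and the merged code may differ):

function step!(learner, phase::TrainingPhase, batch)
    xs, ys = batch
    runstep(learner, phase, (; xs = xs, ys = ys)) do handle, state
        # gradient computation dispatches on the optimiser family
        state.grads = _gradient(learner.optimizer, learner.model, learner.params) do model
            state.ŷs = model(state.xs)
            state.loss = learner.lossfn(state.ŷs, state.ys)
            return state.loss
        end
        # parameter update also dispatches on the optimiser family
        _update!(learner.optimizer, learner.params, learner.model, state.grads)
    end
end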

@darsnack (Member) left a comment:
Looks right to me!

Successfully merging this pull request may close these issues: Use Optimisers.jl.